METHOD FOR DETERMINING THE SQUARE ROOT OF A 
LONG-BIT NUMBER USING A SHORT-BIT PROCESSOR 

BACKGROUND OF THE INVENTION 
1. Field of the Invention

The present invention relates to a method for determining the square
root of a number using a processor and, more particularly, to a method
using a short-bit processor, such as an 8-bit processor, to determine
the square root of a long-bit number, such as a 16-bit or 32-bit
number.

2. Description of Related Art

In applications using an 8-bit microprocessor, such as the Intel 8051,
it is very common to execute an operation to determine the square root
of a number. Typically, the square root of a number is obtained by
using Euclid's algorithm or a bisection method. However, Euclid's
algorithm is time-consuming because multiple data shifting operations
are required. As to the bisection method, because the data width of
the processor is short, such as 8 bits, the short-bit data has to be
rearranged into long-bit data, such as 16-bit data, before being
processed, and thus the operation is also time-consuming.
Particularly, in some real-time applications, such as CD/DVD jump
track calculation, the system performance is likely to be degraded
significantly when these conventional methods are used.

In order to solve the aforementioned problem, a lookup table together
with an interpolation method can be employed. The use of such a lookup
table is direct and requires no heavy operation step, and the result
obtained by using the lookup table is acceptable. However, the lookup
table occupies a large memory space. Particularly, in the CD/DVD drive
tracking application, three lookup tables are needed for the general
CD, the single-layer DVD and the double-layer DVD, and thus the memory
space required is considerable. Accordingly, there is a desire to have
a novel method to mitigate and/or obviate the aforementioned problems.
SUMMARY OF THE INVENTION 

The object of the present invention is to provide a method capable of
determining the square root of a long-bit number by using a short-bit
processor.

In accordance with one aspect of the present invention, the method of
the present invention comprises the steps of: (A) assuming the
long-bit number to be $c \times 2^{2k} + d$, where $c, d < 2^{2k}$,
and its square-root solution to be $(a \times 2^{k} + b)$; (B) finding
'a' by using a bisection method to obtain the floor value of the
square root of 'c'; (C) rearranging and transforming the equations in
step (A) to obtain a successive substitution equation:
$b_{[n]} = ((c - a^{2}) \times 2^{2k} + (d - b_{[n-1]}^{2})) / (a \times 2^{k+1})$;
and (D) giving an initial value to 'b' to execute the successive
substitution equation recursively several times until the equation is
convergent, thereby finding 'b'.

In accordance with another aspect of the present invention, the method
of the present invention comprises the steps of: (A) assuming the
long-bit number to be $c \times 2^{2k} + d$, where $c, d < 2^{2k}$,
and its square-root solution to be $(a \times 2^{k} + b)$; (B)
determining the solution by respectively finding the values of 'a' and
'b'; (C) finding 'a' by taking the floor value of the square root of
'c'; (D) rearranging and transforming the equations in step (A) to
obtain a successive substitution equation:
$b_{[n]} = ((c - a^{2}) \times 2^{2k} + (d - b_{[n-1]}^{2})) / (a \times 2^{k+1})$;
and (E) giving an initial value to 'b' to execute the successive
substitution equation recursively several times until the equation is
convergent, thereby finding 'b'.

Other objects, advantages, and novel features of the invention will
become more apparent from the following detailed description when
taken in conjunction with the accompanying drawings.
BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a flowchart of the method for determining the square root of
a long-bit number using a short-bit processor in accordance with the
present invention.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT

In the method for determining the square root of a long-bit number
using a short-bit processor, it is assumed that the long-bit number
whose square root is to be extracted is:

    $c \times 2^{2k} + d$, where $c, d < 2^{2k}$.

Based on $c \times 2^{2k} + d = (a \times 2^{k} + b)^{2}$, it is
desired to find 'a' and 'b' ($a, b < 2^{k}$) as follows:

By expanding $(a \times 2^{k} + b)^{2}$, we have:

    $(a \times 2^{k} + b)^{2} = a^{2} \times 2^{2k} + a \times 2^{k+1} \times b + b^{2} = c \times 2^{2k} + d$.    (1)

Due to $b < 2^{k}$, we have:

    $c \times 2^{2k} \le (a \times 2^{k} + b)^{2} < ((a + 1) \times 2^{k})^{2}$.    (2)

From the equation (1), because $d < 2^{2k}$, we have $a^{2} \le c$,
and from the equation (2), we have $c < (a + 1)^{2}$. Therefore, we
have $a^{2} \le c < (a + 1)^{2}$, which implies that
$a \le \sqrt{c} < a + 1$. As a result, the value of 'a' can be
determined by taking the floor value of the square root of 'c'; that
is:

    $a = \mathrm{floor}(\sqrt{c})$.    (3)
In equation (3), to determine the floor value of the square root of
'c', the bisection method can be used to find the maximum value of 'a'
that satisfies the condition $a^{2} \le c$. The 'a' in the desired
square root is thus found.
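
The bisection search for 'a' described above can be sketched as follows. This is only an illustrative sketch, not the implementation of the embodiment: the function name `floor_sqrt_bisect`, the choice k = 8, and the sample 32-bit value are assumptions made for the example.

```python
def floor_sqrt_bisect(c: int, k: int = 8) -> int:
    """Largest a in [0, 2**k - 1] with a*a <= c, found by bisection."""
    lo, hi = 0, (1 << k) - 1          # 'a' is assumed to fit in k bits
    while lo < hi:
        mid = (lo + hi + 1) // 2      # round up so the interval always shrinks
        if mid * mid <= c:
            lo = mid                  # mid satisfies a^2 <= c, keep it
        else:
            hi = mid - 1              # mid is too large
    return lo


# Hypothetical input: split a 32-bit value into 16-bit halves c and d (k = 8).
n = 0x12345678
c, d = n >> 16, n & 0xFFFF
a = floor_sqrt_bisect(c)
print(c, d, a)  # 4660 22136 68
```

Each probe of the search squares a k-bit candidate, so with k = 8 the comparison needs only an 8-bit by 8-bit multiplication.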

In order to find the 'b' in the desired square root, the equation (1)
is rearranged as follows:

    $b = (c \times 2^{2k} + d - a^{2} \times 2^{2k} - b^{2}) / (a \times 2^{k+1})$
    $\;\; = ((c - a^{2}) \times 2^{2k} + (d - b^{2})) / (a \times 2^{k+1})$.    (4)

The equation (4) is transformed into a successive substitution formula
as follows:

    $b_{[n]} = ((c - a^{2}) \times 2^{2k} + (d - b_{[n-1]}^{2})) / (a \times 2^{k+1})$.    (5)
In the successive substitution process, 'b' is first assumed to be 0
and applied to equation (5) to find the value of $b_{[0]}$. With this
value of $b_{[0]}$, the successive substitution process proceeds
recursively. In practice, the equation (5) is convergent after two
recursive substitutions, and the resultant value is the actual value
of 'b'. In a digital simulation, it is known that only three
substitutions are required to find the square root of a 32-bit number.
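
As a hedged illustration of the successive substitution process, the sketch below records every intermediate value of 'b'. The function name `substitution_trace`, the use of truncating integer division, and the sample values are assumptions made for the example, and the divisor is taken as a × 2^(k+1), which follows from rearranging equation (1).

```python
def substitution_trace(c: int, d: int, a: int, k: int = 8, steps: int = 3) -> list:
    """Iterate the substitution formula from b = 0; return every intermediate b."""
    if a == 0:
        raise ValueError("a must be nonzero, otherwise the divisor is zero")
    divisor = a << (k + 1)                   # a * 2^(k+1)
    constant = ((c - a * a) << (2 * k)) + d  # (c - a^2) * 2^(2k) + d
    b, trace = 0, []
    for _ in range(steps):
        b = (constant - b * b) // divisor    # integer form of the formula
        trace.append(b)
    return trace


# Hypothetical input: 4296**2 = 18455616, so c = 281, d = 40000, a = 16.
print(substitution_trace(281, 40000, 16))  # [204, 199, 200]
```

In this example the third value, 200, gives 16 × 256 + 200 = 4296, the exact root. The sample deliberately uses a small 'a'; for larger 'a' the divisor grows and the correction term contributed by the square of 'b' shrinks, so fewer substitutions change the value. Because the division truncates, the last bit of 'b' is not guaranteed in every case, and an application would normally check the result.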

FIG. 1 shows the flowchart of the method for determining the square
root of a long-bit number using a short-bit processor, as derived from
the aforementioned equations. Based on the above derivation, the
flowchart finds the square root of a long-bit number by respectively
determining the values of 'a' and 'b'. The value of 'a' is first
determined by taking the floor value of the square root of 'c' (step
11). Then, steps 12-16 are performed to determine the value of 'b'. In
step 12, a loop number 'n' is set to m, where m=3 for a 32-bit
long-bit number. In step 13, 'b' is initialized to 0. In step 14, the
value of 'b' is applied to the successive substitution formula for
determining the next value of 'b'. Next, 'n' is decreased by 1 (step
15), and if 'n' is not equal to 0 (step 16), the successive
substitution formula is executed again. After three recursive
processes, the execution flow is terminated, and the obtained values
of 'a' and 'b' are the solution of the square root.
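
The flow of steps 11-16 can be sketched end to end as follows. This is a minimal sketch under stated assumptions rather than the embodiment itself: the function name `sqrt_long`, the variable `count` (standing in for the loop number 'n'), the parameter `value` (the long-bit number, assumed to be below 2^(4k)), and the divisor a × 2^(k+1) derived from equation (1) are all choices made for the example.

```python
def sqrt_long(value: int, k: int = 8, m: int = 3) -> tuple:
    """Return (a, b) such that a * 2**k + b approximates sqrt(value)."""
    c = value >> (2 * k)                # high half of the long-bit number
    d = value & ((1 << (2 * k)) - 1)    # low half of the long-bit number

    # Step 11: a = floor(sqrt(c)), here by bisection over k-bit candidates.
    lo, hi = 0, (1 << k) - 1
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid * mid <= c:
            lo = mid
        else:
            hi = mid - 1
    a = lo
    if a == 0:
        raise ValueError("c must be at least 1 so that the divisor is nonzero")

    count = m                           # step 12: loop number set to m
    b = 0                               # step 13: initial value of b
    while True:
        # Step 14: successive substitution formula with truncating division.
        b = (((c - a * a) << (2 * k)) + d - b * b) // (a << (k + 1))
        count -= 1                      # step 15: decrease the loop number
        if count == 0:                  # step 16: stop when it reaches 0
            break
    return a, b


a, b = sqrt_long(0x12345678)
print(a, b, a * 256 + b)  # 68 68 17476
```

For this sample input, 17476 is the floor of the square root of 0x12345678. The sketch keeps the fixed loop count m = 3 from the description instead of testing for convergence, so the running time does not depend on the input.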

In case of using an 8-bit processor to find the square root of a
32-bit number, the square root of 'c' in step 11 can be determined by
finding the maximum value of 'a' that satisfies the condition
$a^{2} \le c$, using the bisection method. Therefore, the result can
be obtained quickly because only one multiplication cycle is required.
Furthermore, whenever the division process in step 14 is performed,
only one division cycle for dividing a 16-bit number by an 8-bit
number is required. In the previous example, step 14 is executed three
times, and thus only four execution cycles are required in total to
find the square root of a 32-bit number. The following Table 1 shows a
comparison of the present method and the conventional methods:



Table 1

method               multiplication              division                    total # of executions
-------------------  --------------------------  --------------------------  ---------------------
present invention    1 execution                 1 execution for 3 loops     1+3=4
                     (8bit x 8bit)               (16bit/8bit)
Euclid's algorithm                               2 executions for 2 loops    4
                                                 (32bit/16bit)
bisection method     4 executions for 2 loops                                8
                     (16bit x 16bit)
As depicted in the table, the conventional bisection method requires
16-bit x 16-bit multiplication, and two loops are executed, each loop
having four execution cycles (a total of eight execution cycles is
required). Euclid's algorithm requires 32-bit/16-bit division, and two
loops are executed, each loop having two execution cycles (a total of
four execution cycles is required). In this example, the efficiency of
Euclid's algorithm is similar to that of the present invention.
However, when used in the real-time CD/DVD tracking application, the
number of loops for Euclid's algorithm is larger than 2, whereas that
for the present invention remains 3. Therefore, the method of the
present invention is better at decreasing the execution time. In
practice, when applied to the 8051 processor, the method of the
present invention can reduce the processing time by 20 percent.

Although the present invention has been explained in relation to its
preferred embodiment, it is to be understood that many other possible
modifications and variations can be made without departing from the
spirit and scope of the invention as hereinafter claimed.